Support Anthropic API /v1/messages Endpoint #22627
Conversation
Code Review
This pull request introduces support for the Anthropic Messages API by adding a new API server, protocol definitions, and a serving layer for format conversion. The implementation is based on the existing OpenAI-compatible server. My review has identified several critical and high-severity issues, including a potential `NoneType` access error, incorrect Pydantic model usage that could lead to validation errors, a risk of generating duplicate tool call IDs, and another case of incorrect attribute access on a Pydantic model that would cause a runtime error. I have provided specific code suggestions to address these issues and ensure the stability and correctness of the new endpoint.
```python
return messages(raw_request).create_error_response(
    message="The model does not support Chat Completions API")
```
When `handler` is `None`, `messages(raw_request)` is also `None`. Calling `create_error_response` on a `None` object will raise an `AttributeError`, causing an unhandled exception and a 500 server error. You should construct an `ErrorResponse` directly to ensure a proper error is returned. You will need to import `ErrorResponse` from `vllm.entrypoints.openai.protocol` and `HTTPStatus` from `http`.
Suggested change:

```diff
-return messages(raw_request).create_error_response(
-    message="The model does not support Chat Completions API")
+return ErrorResponse(message="The model does not support Chat Completions API",
+                     type="model_not_found",
+                     code=HTTPStatus.NOT_FOUND.value)
```
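A minimal, self-contained sketch of the corrected error path. The `ErrorResponse` class below is a hypothetical stand-in for the one in `vllm.entrypoints.openai.protocol`, whose real schema may differ; it only illustrates constructing the error directly instead of calling a method on a `None` handler.

```python
from http import HTTPStatus

from pydantic import BaseModel


class ErrorResponse(BaseModel):
    # Hypothetical stand-in for vllm.entrypoints.openai.protocol.ErrorResponse
    message: str
    type: str
    code: int


def chat_completions_not_supported() -> ErrorResponse:
    # Build the error response directly; no attribute access on a
    # possibly-None handler, so no AttributeError / 500.
    return ErrorResponse(
        message="The model does not support Chat Completions API",
        type="model_not_found",
        code=HTTPStatus.NOT_FOUND.value)
```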
```python
req.tool_choice = ChatCompletionNamedToolChoiceParam.model_validate({
    "type": "function",
    "function": {
        "name": anthropic_request.tool_choice.get("name")
```
`anthropic_request.tool_choice` is a Pydantic model instance, not a dictionary. Its attributes should be accessed with dot notation (e.g., `.name`). Using `.get("name")` will raise an `AttributeError` at runtime.
Suggested change:

```diff
-        "name": anthropic_request.tool_choice.get("name")
+        "name": anthropic_request.tool_choice.name
```
```python
class AnthropicMessagesResponse(BaseModel):
    """Anthropic Messages API response"""
    id: str
```
The `id` field is defined as required. The `model_post_init` method, which attempts to set a default value, runs after Pydantic's validation; if `id` is not provided at initialization, a `ValidationError` is raised before `model_post_init` can execute. To provide a default value correctly, use `default_factory` in the field definition and remove the `model_post_init` method.
Suggested change:

```diff
-    id: str
+    id: str = Field(default_factory=lambda: f"msg_{int(time.time() * 1000)}")
```
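To see why `default_factory` fixes the ordering problem, here is a small sketch with the other response fields omitted: the factory runs during validation whenever `id` is absent, so construction without an `id` no longer fails.

```python
import time

from pydantic import BaseModel, Field


class AnthropicMessagesResponse(BaseModel):
    # default_factory is evaluated at validation time when `id` is omitted,
    # so no ValidationError is raised and model_post_init is unnecessary.
    id: str = Field(default_factory=lambda: f"msg_{int(time.time() * 1000)}")


resp = AnthropicMessagesResponse()            # id auto-generated
custom = AnthropicMessagesResponse(id="msg_abc")  # explicit id still accepted
```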
```python
elif block.type == "tool_use":
    # Convert tool use to function call format
    tool_call = {
        "id": block.id or f"call_{int(time.time())}",
```
Using `int(time.time())` to generate tool call IDs is not safe: tool calls created in the same second receive duplicate IDs, which can break the matching of tool calls to their results. A UUID-based approach is better; you can import and use `random_tool_call_id` from `vllm.entrypoints.chat_utils` for this.
Suggested change:

```diff
-        "id": block.id or f"call_{int(time.time())}",
+        "id": block.id or random_tool_call_id(),
```
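A sketch of a UUID-based generator equivalent to what the review suggests. The exact prefix format produced by vLLM's `random_tool_call_id` is an assumption here; the point is that uniqueness no longer depends on wall-clock seconds.

```python
import uuid


def random_tool_call_id() -> str:
    # Sketch of a UUID-based ID generator (vLLM ships a helper in
    # vllm.entrypoints.chat_utils; its prefix may differ). Two tool
    # calls created in the same second can no longer collide.
    return f"chatcmpl-tool-{uuid.uuid4().hex}"


# Even a large burst of IDs generated back-to-back stays unique.
ids = {random_tool_call_id() for _ in range(10_000)}
```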
```python
content: List[AnthropicContentBlock] = [
    AnthropicContentBlock(
        type="text",
        text=generator.choices[0].message.content
    )
]
```
A text `AnthropicContentBlock` is created even if `generator.choices[0].message.content` is `None`. The Anthropic API requires the `text` field on text blocks, so when the response is serialized with `exclude_none=True` the result is an invalid content block. You should only create the text content block when content is available.
Suggested change:

```diff
-        content: List[AnthropicContentBlock] = [
-            AnthropicContentBlock(
-                type="text",
-                text=generator.choices[0].message.content
-            )
-        ]
+        content: List[AnthropicContentBlock] = []
+        if generator.choices[0].message.content:
+            content.append(
+                AnthropicContentBlock(
+                    type="text",
+                    text=generator.choices[0].message.content))
```
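A quick demonstration of why a `None` text is a problem, using a minimal hypothetical version of the content block model (not the PR's actual class):

```python
from typing import Optional

from pydantic import BaseModel


class AnthropicContentBlock(BaseModel):
    # Minimal hypothetical sketch of the PR's content block model
    type: str
    text: Optional[str] = None


block = AnthropicContentBlock(type="text", text=None)
# With exclude_none=True the required `text` key disappears entirely,
# leaving only {"type": "text"} -- not a valid Anthropic text block.
dumped = block.model_dump(exclude_none=True)
```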
Exciting! Make sure to add some unit tests before this is ready, and I wonder if we can also include a smoke test verifying that Claude Code or some other application can communicate with the API correctly.
Thanks @LiuLi1998! Would you also be willing to help with ongoing support/maintenance of the API?
Thanks for the input! I agree. I'll add some tests soon to make sure everything works as expected.

Definitely! I'm glad to take part in the support/maintenance of the API.
I've added initial tests. I'm not entirely sure if the current approach follows best practices or covers everything needed; I would really appreciate your feedback on improvements or any other cases.
This PR adds support for the Anthropic /v1/messages REST API endpoint to the vLLM FastAPI server.
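For reference, a request in the Anthropic Messages format that the new endpoint is meant to accept might look like the payload below. The model name and prompt are placeholders; in practice this JSON body would be POSTed to the server's `/v1/messages` route.

```python
import json

# Hypothetical request body following the Anthropic Messages API schema.
payload = {
    "model": "my-served-model",   # placeholder for the served model name
    "max_tokens": 256,            # required by the Messages API
    "messages": [
        {"role": "user", "content": "Hello, world"},
    ],
}
body = json.dumps(payload)
```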